(54) System and method for generating file updates for files stored on read-only media

(57) This invention relates to file updating methods,
and more particularly to file updating methods for files
stored on read-only media. To represent modifications
to a set of data objects, the method according to the
present invention comprises generating a baseline of a
first version of a set of data objects; identifying the
differences between a second version of the set of data
objects and the first version of the set of data
objects; generating update information corresponding
to the identified differences, said update information
including references to segmented portions of said
baseline that correspond to said identified differences
regardless of location of said segmented portions in
said baseline; and storing the update information
whereby the update information may be retrieved and
used to generate the second version of the set of data
objects from the first version of the set of data
objects.

Description

Field of the Invention

[0001] This invention relates to file updating methods,
and more particularly, to file updating methods for
files stored on read-only media.

Background of the Invention 

[0002] Software programs and data are frequently
distributed on large capacity storage media such as
compact disc-read only memories (CD-ROM). These
devices are preferably read-only devices to preserve the
integrity of the data and program files stored on the
device. Such storage devices contain multiple data and
executable program files for an application program and
typically include a program to install the program and
data files on a user's computer. A common application
for distributing a computer program and data files is to
provide an interface program and data files for employees
that use computers remotely from a central site. For
example, a company may equip its sales force with CD-ROMs
that contain an interface program that retrieves
information from the data files. The retrieved data is
used to respond to a customer's questions on product
availability or product specifications.
[0003] Having timely information available to an
organization's remote personnel or being able to provide
program patches and other improvements to
update software distributed to end users is important for
customer service and product support. When the data
or programs stored on read-only media are frequently
updated or modified, the cost of providing the program
or data updates on new storage media such as another
CD-ROM can be prohibitively expensive.
[0004] Accordingly, what is needed is a way to
update information stored on read-only media without
having to produce new read-only media containing the
program or data updates for distribution to a company's
customers or remote personnel.

Summary of the Invention

[0005] The above limitations of previously known
program and data storage read-only storage devices
are overcome by a system and method made in accordance
with the principles of the present invention. The
method of the present invention includes the steps of
generating a basis index table identifying the data content
of an original file system, generating a file of modification
data blocks that may be used to modify the data
content of the original file system, and generating a
delta look up table for identifying the data blocks in the
original file system and the data blocks in the file of
modification data blocks that provide the data content
for a new version of the original file system. The delta
look up table and the file of modification data blocks
may be stored for delivery to a computer on which a
copy of the original file system is stored. The delta look
up table and the file of modification data blocks are then
used by the computer system on which a copy of the
original file system is stored to provide the data content
for a new version of the original file system in a way that
appears to provide a single file system containing the
new version of the file system. Thus, the method of the
present invention may be used to generate data for
updating the content of a copy of the original file system
without having to generate a copy of every file and data
block for the new content of the original file system.
[0006] Preferably, the method generates the basis
index table by building a basis directory entry meta-data
table and a basis index data block table. The basis
directory entry meta-data table organizes the meta-data
for each entry in a directory enumeration of the original
file system by entry name. Preferably, the entry name
identifies the entry and its parent.
[0007] The meta-data stored for each entry is
known meta-data such as file attributes. The basis index
data block table uniquely identifies each data block
found within the original file system. For each unique
data block identifier, a source file identifier that identifies
the source file for the data block, the offset to the first
data unit for the block within the source file, and the
length of the data block are stored. These two tables
may then be used to generate the files for generating a
new version of the file system.
[0008] The method of the present invention also
includes the steps of generating a delta directory map
file to identify the structure of the entries in the new version
of the original file system, a delta look up table
(LUT) file for identifying the location of the data blocks to
generate the files in the new version of the original file
system, and a delta modification data block file that contains
the new data content for the new version of the
original file system. The delta directory map file contains
the name for the entries in the new version of the
original file system, the modification status for the
entries in the new version of the file system, the meta-data
for each entry having a modification status of
"modified", "contents modified" or "new", the first look
up table record for each file entry, and the number of
look up table records used to construct the file in the
new version of the original file system. The delta look up
table contains at least one LUT record for each file entry
having a modification status of "contents modified" or
"new". A LUT record identifies the source file containing
the data block, the location of the first data unit of the
data block in the identified source file, the length of the
data block, and the offset of the first data unit of the data
block in the file being processed. The source file identifier
either identifies a file of the original file system or the
modification data block file for the new version. The LUT
records for all of the files in the new version of the original
file system are stored in an LUT file. The location of
the first LUT record for a file is identified by a pointer
stored in the meta-data of the delta directory map file for
file entries having a modification status of "contents
modified" or "new". The directory map file may also be
used in the computer having a copy of the original file
system to generate information for display or use
regarding the structure of the new version of the original
file system and its new data content. Structure data
includes data that is displayed in response to a directory
enumeration command or the like. The delta modification
data block file contains the data blocks having new
data content for the file entries of the new version of the
original file system. As the data blocks for the new data
content of the new version of the original file system are
stored in the delta modification data block file, a delta
index data block table is generated. This table includes
a unique identifier for each data block stored in the delta
data block modification file that has unique data content,
an identifier that indicates the version of the delta
data block modification file that is the source file for the
block, the offset to the first data unit for the data block
and the length of the data block. The delta index data
block table is appended to the basis index data block
table.
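
The record layouts described above can be pictured as a few small structures. The following is a minimal sketch in Python, not the on-disk format of the invention; the class names, the field names (`source_id`, `first_lut`, and so on) and the helper `records_for` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LutRecord:
    """One look up table record (field names are illustrative assumptions)."""
    source_id: str      # a file of the original file system, or a delta modification data block file
    source_offset: int  # location of the first data unit of the block in the source file
    length: int         # length of the data block in data units
    target_offset: int  # offset of the block's first data unit in the file being built


@dataclass
class DirectoryMapEntry:
    """One entry of the delta directory map file (illustrative assumption)."""
    name: str
    status: str                      # "new", "modified", "contents modified", "unmodified" or "deleted"
    metadata: Optional[dict] = None  # kept only for "new", "modified" and "contents modified"
    first_lut: Optional[int] = None  # position of the first LUT record for a file entry
    lut_count: int = 0               # number of LUT records used to construct the file


def records_for(entry: DirectoryMapEntry, lut_file: list) -> list:
    """Return the LUT records that belong to one directory map entry."""
    if entry.first_lut is None:
        return []
    return lut_file[entry.first_lut:entry.first_lut + entry.lut_count]
```

Storing only a starting position and a count in the directory map entry lets all LUT records for all files live in one flat LUT file, as the paragraph above describes.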

[0009] The delta directory map file, the delta modification
data block file and the delta look up table may be
compressed and stored on storage media or downloaded
to a computer having a copy of the original file
system. The downloaded delta directory map file, the
delta modification data block file and the delta look up
table file are used to seamlessly regenerate a new version
of the original file system. This regeneration of the
new version of the original file system is done in a manner
which gives the appearance that the contents of the
device on which a copy of the original file system is
stored have been modified, even if the device uses
read-only media for storage of the original file system.
Thus, the system and method of the present invention
provide a mechanism for updating the contents of a file
system without requiring the production of a complete
file system corresponding to the new version of the program
and/or data stored in the file system.
[0010] Preferably, the method and system of the
present invention may be used to generate a representation
of a new version of an original file system with reference
to the original file system and to the delta
modification data block files for previous versions of the
original file system. This use of previous versions
reduces the amount of data to be stored in the delta
directory map file, delta modification data block file, and
delta look up table for the latest version of an original file
system. In this embodiment of the present invention, the
process for generating the files for the new version of
the original file system produces delta data block
records that identify the source file for a data block as
being either a file in the original file system, a delta modification
data block file for a previous version of the original
file system or the delta modification data block file
for the new version of the original file system. The version
of the delta modification data block table containing
a data block is determined from the delta index data
block tables appended to the basis index data block
table.

[0011] These and other benefits and advantages of
the present invention shall become apparent from the
detailed description of the invention presented below in
conjunction with the figures accompanying the description.

Brief Description of the Drawings 
[0012] 

Fig. 1 is a depiction of a screen shot of a file system
hierarchy that may be evaluated by the system of
the present invention;

Fig. 2 is a flowchart of an exemplary process that
generates a representation of the original file system;

Figs. 3A and 3B are a flowchart of an exemplary
process that generates the look-up table file and
modification data block file for an update to the representation
of the original file system generated by
the process shown in Fig. 2;

Fig. 4 is a flowchart of an exemplary process that
generates a delta directory map file for the new version
of the original file system from the delta directory
entry meta-data table generated by the
process shown in Figs. 3A and 3B; and

Fig. 5 is a flowchart of an exemplary process that
uses the files for an update generated by the process
shown in Figs. 3A, 3B and 4 to generate a latest
version of the original file system.

Detailed Description of the Invention

[0013] Fig. 1 depicts a screen shot of a file system
hierarchy. The hierarchy for the file system is comprised
of a directory having a list of file entries and subdirectory
entries. The subdirectory entries may include additional
files for the file system. Each entry in the directory
for the file system hierarchy also contains meta-data.
For the file entries the meta-data includes known file
meta-data such as the file name, file attributes, and
other known file meta-data.

[0014] In order to generate modification data files
for a file system hierarchy, the original version of the file
system hierarchy is processed and information about
the system is stored in a file system map file. This process
is depicted in Fig. 2. The process begins by
processing the directory file for the highest level of the
file system hierarchy to identify the entries for the subdirectories
and files at the highest level in the file system
(Block 50). For each entry, meta-data for the entry is
stored in a basis directory meta-data table (Block 54). If
the entry is a subdirectory (Block 56), the process determines
whether another directory entry exists for
processing (Block 90). If there is, it is processed (Block
54). Otherwise, the process terminates as the basis
directory entry meta-data table and basis index data
block table have been generated.
[0015] For a file entry being processed, the file is
segmented into blocks of one or more fixed lengths
(Block 60). Preferably, the block length or lengths are 
chosen so that the whole basis index data block table 
can be held in the memory of the computer. In this way 
every block of memory can be directly and efficiently 
accessed. For this reason the block length should be 
determined as a function of the available computer 
resources. 

[0016] For each block, an iterative checksum (Block
62) and then a safe checksum (Block 64) is generated.
The iterative checksum is a value that is computed from
the data values for each byte within a block beginning at
the first byte of the block to the last byte in the block. It
possesses the property that an iterative checksum for a
data block comprised of the first N data units in a data
string may be used to generate the iterative checksum
for the next data block comprised of the N data units
beginning at the second data byte. This is done by performing
the inverse iterative checksum operation on the
iterative checksum using the data content of the first
data unit of the first block to remove its contribution to
the iterative checksum and then performing the iterative
checksum operation on the resulting value using the
(N+1)th data unit that forms the last data unit for the next
data block. Thus, two data operations may be used to
generate the iterative checksum for the next block in a
data string in which the successive data blocks are
formed by using a sliding data window in the data string.
For example, an addition operation may be used to generate
an iterative checksum having the property noted
above. A safe checksum is generated by a process that
is less likely to produce the same checksum for two
blocks having different data contents than the storage
media is likely to return an inaccurate data value. A safe
checksum generation method well known within the
data communication art is the MD5 checksum. The iterative
and safe checksum pair for a data block form a
checksum identifier that is used to identify the data
block. The iterative checksum is not as computationally
complex as the safe checksum so the iterative checksum
is a relatively computational resource efficient
method for determining that two data blocks may be
the same. The safe checksum may then be used to verify
that the data content of the blocks are the same and
reduce the likelihood of a false positive identification. If
the checksum identifier is the same as the checksum
identifier for a data block previously stored in the index
data block table (Block 68) then the data content of the
data block is not unique. Thus, the data block record in
the index data block table for the corresponding checksum
identifier adequately defines the data block being
processed so the checksum identifier is not stored in the
index data block table and the process determines
whether another data block is to be processed (Block
82).
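
The rolling property of the iterative checksum can be illustrated with the addition operation mentioned above. This is a minimal sketch, not the exact checksum of the invention: the function names and the 32-bit modulus `MOD` are assumptions, and the safe checksum uses MD5 through Python's standard `hashlib` module.

```python
import hashlib

MOD = 1 << 32  # assumed fixed width for the additive checksum


def iterative_checksum(block: bytes) -> int:
    """Additive checksum over every byte of the block."""
    return sum(block) % MOD


def roll(checksum: int, outgoing: int, incoming: int) -> int:
    """Slide the window by one data unit using two operations:
    subtract the unit that leaves, add the unit that enters."""
    return (checksum - outgoing + incoming) % MOD


def safe_checksum(block: bytes) -> str:
    """MD5 digest, used to confirm a match suggested by the iterative checksum."""
    return hashlib.md5(block).hexdigest()


def checksum_identifier(block: bytes):
    """The (iterative, safe) pair that identifies a data block."""
    return iterative_checksum(block), safe_checksum(block)
```

A purely additive checksum is cheap but weak: `iterative_checksum(b"ab")` equals `iterative_checksum(b"ba")`, which is why the safe checksum is needed to rule out false positives.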

[0017] If the checksum identifier indicates the data
block content is unique, the iterative checksum is stored
as the primary key in the index data block table and the
safe checksum is stored in the index data block table as
a qualified key (Block 70). Associated with the checksum
identifier for the block is an identifier for the file from
which the data block came (Block 74), the offset from
the first byte within the file to the first byte in the data
block (Block 76), and the length of the data block (Block
78). The source file identifier may be the name of the file
in which the data block is stored, but preferably, it is a
pointer to the meta-data in the basis directory entry
meta-data table for the source file. This process of identifying
and storing information about each data block in
the index data block table continues (Block 82) until all
of the blocks for a file entry have been processed. A
safe checksum for the entire data content of the file is
then generated and stored in the basis directory entry
meta-data table (Block 84). The process continues
(Block 90) until all entries for the entire directory structure
for the original file system have been processed.
The basis directory entry meta-data table and basis
index data block table, which form the file system map file
representing the meta-data and data content for each entry
within the file system hierarchy, are then stored on storage
media (Block 96). This data forms the baseline for generating
modification data files for updating the original file system.
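
The construction of the basis index data block table can be sketched as follows. This is an illustrative outline under stated assumptions: files are given as an in-memory mapping from name to bytes, the additive checksum and MD5 stand in for the iterative and safe checksums, a short trailing block is indexed with its actual length, and the names `build_basis_index` and `BLOCK_LEN` are hypothetical.

```python
import hashlib

BLOCK_LEN = 4  # tiny block length, for illustration only


def checksum_identifier(block: bytes):
    """(iterative, safe) checksum pair; additive sum and MD5 are assumed here."""
    return sum(block) % (1 << 32), hashlib.md5(block).hexdigest()


def build_basis_index(files: dict, block_len: int = BLOCK_LEN):
    """files maps a file name to its bytes. Returns (index, file_sums)."""
    index = {}      # iterative checksum -> {safe checksum: (source file, offset, length)}
    file_sums = {}  # file name -> safe checksum of the entire file content
    for name, data in files.items():
        for offset in range(0, len(data), block_len):
            block = data[offset:offset + block_len]
            iterative, safe = checksum_identifier(block)
            bucket = index.setdefault(iterative, {})
            if safe not in bucket:  # only blocks with unique content get a record
                bucket[safe] = (name, offset, len(block))
        file_sums[name] = hashlib.md5(data).hexdigest()
    return index, file_sums
```

Keying the outer table on the iterative checksum mirrors its role as the primary key, so a cheap lookup can reject most windows before any MD5 digest is computed.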

[0018] Whenever a new version of a file system
hierarchy is generated, either by changing, deleting or
adding data to a file or its meta-data or by adding or
deleting data files to the file system, a delta modification
data block file and delta look up table may be generated
to provide the update information for the differences
between the original file system hierarchy and the new
version of the file system hierarchy. The process for
generating the delta modification data block file and the
delta look up table is shown in Fig. 3. That process
begins by reading the directory file for the new file system
hierarchy and identifying the entries for the subdirectories
and files in the file system hierarchy (Block
100). Each entry is then processed by storing the meta-data
for the entry in a delta directory entry meta-data
table (Block 104). The status of the entry is then determined
by searching the basis directory entry meta-data
table for an entry having the same name under the
same parent (Block 108). If no corresponding entry is
located in the basis directory entry meta-data table
(Block 110), then the modification status for the entry in
the new file system hierarchy is set to "new" (Block 112).
If a corresponding entry is located in the basis directory
entry meta-data table then the meta-data for the corresponding
entry is compared to the meta-data for the
entry in the delta directory entry meta-data table (Block
114) and, if the meta-data is the same for both entries,
the modification status is set to "unmodified" (Block
116). If the meta-data for the entries do not correspond
and the entries are not files (Block 120), the modification
status is set to "modified" (Block 122). If the meta-data
for the entries do not correspond and the entries
are files, a safe checksum is generated for the data contents
of the file entry in the new file system (Block 126).
This safe checksum is compared to the safe checksum
for the entire data content of the file stored in the basis
directory entry meta-data table (Block 128) and if they
are not equal, the modification status is set to "contents
modified" (Block 130). Otherwise, the modification status
is set to "modified" (Block 134). The modification
status is stored in the delta directory entry meta-data
table. This process continues until all of the entries in
the new version of the original file system have been
processed (Block 136).
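
The status decision of Blocks 108 to 134 amounts to a short chain of tests. The sketch below is a hedged illustration: the function name `classify`, the shape of the `basis` mapping and the use of MD5 as the safe checksum are assumptions, and meta-data is modelled as a plain dictionary.

```python
import hashlib


def classify(name, new_meta, new_data, basis):
    """Return the modification status of one entry of the new version.

    basis maps an entry name (including its parent path) to a tuple
    (meta, is_file, safe_sum) taken from the basis directory entry
    meta-data table.
    """
    if name not in basis:
        return "new"                      # Block 112
    old_meta, is_file, old_sum = basis[name]
    if new_meta == old_meta:
        return "unmodified"               # Block 116
    if not is_file:
        return "modified"                 # Block 122
    if hashlib.md5(new_data).hexdigest() != old_sum:
        return "contents modified"        # Block 130
    return "modified"                     # Block 134
```

Note that the whole-file safe checksum is only computed when the meta-data differs, so unchanged entries are classified without reading their contents.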

[0019] The basis directory entry meta-data table is
now searched to determine whether a corresponding
entry exists in the delta directory entry meta-data table.
Specifically, a directory entry in the basis directory entry
meta-data table is selected (Block 140) and the delta
directory entry meta-data table is searched for a corresponding
entry (Block 142). If no corresponding entry is
located, an identifier for the entry and a modification status
of "deleted" is generated and stored in the delta
directory entry meta-data table (Block 144). The process
continues until all entries in the basis index directory
entry meta-data table have been checked (Block 146).
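
The reverse pass that detects deleted entries is a simple membership test. In this sketch the delta directory entry meta-data table is assumed to be a dictionary keyed by entry name, and `mark_deleted` is a hypothetical helper name.

```python
def mark_deleted(basis_names, delta_table):
    """Add a "deleted" record for every basis entry that has no
    counterpart in the delta directory entry meta-data table."""
    for name in basis_names:
        if name not in delta_table:
            delta_table[name] = {"status": "deleted"}
    return delta_table
```
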
[0020] The process now selects an entry in the
delta directory entry meta-data table (Block 150) and
determines whether it has a modification status of "new"
or "contents modified" (Block 152). For these entries,
look up table (LUT) records are generated and data
blocks stored in the delta modification data block file, if
necessary. If an entry is identified as being a "new" or
"contents modified" entry, a sliding window of N data
units, such as 256 bytes, is used to define data blocks
(Block 156). As noted before, the number N must be
one of the block sizes used to segment files in the original
file system for constructing the basis index data
block table. An iterative checksum is computed for the
first data block formed by the sliding window being
placed at the first data unit of the data contents of the
"new" or "contents modified" file (Block 158). This iterative
checksum is compared to the iterative checksums
of the checksum identifiers stored in the basis index
data block table to determine whether a corresponding
entry may exist (Block 160). If no corresponding iterative
checksum is found, the checksum identifier for the
data block being processed cannot be the same as one
in the basis index data block table so the first data unit
of the data block in the sliding window is stored in a
delta modification data block file (Block 162). The sliding
window is then moved to remove the first data unit from
the data block in the file being processed and to add the
next data unit (Block 156). The iterative checksum for
the data block in the sliding window is computed (Block
158) and compared to the iterative checksums of the
checksum identifiers in the basis index data block
table (Block 160). Because the iterative checksum has
the property discussed above, the iterative checksum
for each successive data block only requires calculations
to remove the contribution of the data units
removed from the block by moving the sliding window
and to add the contributions of the data units added by
moving the sliding window. Moving the sliding window,
generating the next iterative checksum and comparing
the generated iterative checksum to those for the checksum
identifiers in the basis index data block table continues
until a corresponding iterative checksum for one
of the checksum identifiers is located or the number of
data units stored to the delta modification data block file
corresponds to the number of data units for a data block
(Block 172). When a data block of modification data has
been stored to the delta modification data block file, the
iterative and safe checksums for the block are generated
to form a checksum identifier for the block (Block
174). The iterative checksum and safe checksum for the
block of modification data are then stored as the primary
key and qualified key, respectively, in a delta index data
block table associated with the new version of the original
file system. An identifier of the delta modification
data block file in which the data block is stored, the offset
into that file that defines the location of the first data
unit for the data block being processed, and the length
of the data block being processed are also stored in the
delta index data block table in association with the iterative
and safe checksums (Block 176).
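
The sliding-window search of Blocks 156 to 162 can be sketched as a scan that advances one data unit at a time until the window's iterative checksum appears in the basis index. This is a simplified illustration: it assumes a plain additive checksum, takes the known iterative checksums as a set, and omits the Block 172 step that indexes each full block of modification data. The function name `scan_until_candidate` is hypothetical.

```python
def scan_until_candidate(data: bytes, start: int, n: int, known: set):
    """Slide an n-unit window from `start` until its additive checksum is in `known`.

    Returns (position, literal): the window position of the first candidate
    match (or None if there is none) and the data units passed over, which
    would be written to the delta modification data block file.
    """
    if start + n > len(data):
        return None, data[start:]            # not enough data to fill the window
    checksum = sum(data[start:start + n])    # checksum of the first window
    pos = start
    while True:
        if checksum in known:
            return pos, data[start:pos]
        if pos + n >= len(data):
            return None, data[start:]        # last window reached without a match
        # Roll the window: drop data[pos], take in data[pos + n].
        checksum += data[pos + n] - data[pos]
        pos += 1
```

A candidate found this way is only a possible match; the safe checksum comparison described next decides whether the blocks are really the same.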

[0021] Once an iterative checksum for a data block
within the sliding window corresponds to one or more
iterative checksums in the checksum identifiers stored
in the basis index data block table, the process computes
the safe checksum for the block within the sliding
window and compares it to the safe checksums of the
checksum identifiers selected from the basis index data
block table (Block 178). Only one, if any, safe checksum
of the checksum identifiers should be the same as the
safe checksum computed for the data block. If a corresponding
safe checksum is identified, the data blocks
are the same. The process determines whether the previous
data block checksum identifier comparison indicated
a corresponding checksum identifier in the basis
index data block table was located (Block 180). If the
previous checksum identifier comparison did not find a
corresponding checksum identifier, a look up table
(LUT) record is generated for the data units stored in the
delta modification data block file since the last corresponding
checksum identifier was detected (Block 182).
That is, all of the data following the identification of the
last data block that is also in the basis index data block
table is stored in the delta data modification file and the
LUT record for that data indicates that the data is a contiguous
block of data. The LUT record is comprised of a
delta modification data block file identifier, the offset
from the first data unit in the modification data file to the
contiguous data block stored in the modification data
file, the number of data units in the contiguous data
block stored in the modification data file, and the offset
of the data block in the file currently being processed.
The first three data elements in the LUT record identify the
source file for the data block in the new version of the
original file system and its location in that file while the
fourth data element defines the location of the data
block in the file of the new version of the original file system.
As discussed below, this permits the application
program that controls access to the new version of the
original file system to not only know from where it can
retrieve the data block but where it goes in the new version
of the file.
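
How the four elements of a LUT record let a reader assemble a file can be shown with a short sketch. It assumes LUT records are given as a list of tuples and that every source (original files and delta modification data block files) is available as bytes in a dictionary; the function name `rebuild` is hypothetical.

```python
def rebuild(lut_records, sources):
    """Assemble one file of the new version from its LUT records.

    lut_records: list of (source_id, source_offset, length, target_offset).
    sources: maps a source identifier to that source's bytes.
    """
    size = max((t + n for _, _, n, t in lut_records), default=0)
    out = bytearray(size)
    for source_id, source_offset, length, target_offset in lut_records:
        # The first three elements say where the block comes from ...
        chunk = sources[source_id][source_offset:source_offset + length]
        # ... and the fourth says where it goes in the new file.
        out[target_offset:target_offset + length] = chunk
    return bytes(out)
```

For example, with `sources = {"orig/a.txt": b"hello world", "delta.v2": b"brave new "}`, the records `("orig/a.txt", 0, 6, 0)`, `("delta.v2", 0, 10, 6)` and `("orig/a.txt", 6, 5, 16)` yield `b"hello brave new world"` without modifying either source.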

[0022] At this point in the process, the checksum
identifier for the data block within the sliding window has
been identified as being the same as a checksum identifier
in the basis index data block table. As this block
already exists in a file in the original version of the file
system, a different LUT record is generated for the data
block within the sliding window (Block 198). The LUT
record for the data block that corresponds to the checksum
identifier stored in the basis index data block table
is comprised of the same source file identifier as the
one in the basis index data block table, the same offset
from the start of the source file, the same data block
length stored in the basis index data block table, and the
offset of the data block in the file currently being processed.
The process then continues by determining
whether the previous LUT record for the file being processed
has a source file identifier that is the same as the
one for the LUT record generated for the data block
within the sliding window (Block 200). If it does and the
LUT record just generated is for a data block that is contiguous
with the data block identified by the previous LUT
record, the process increases the length stored in the
previous LUT record by the length of the data block in
the LUT record generated for the data block just processed
and discards the new LUT record (Block 202).
This corresponds to the situation where contiguous
blocks of the data in a file of the new version of the original
file system are the same as a group of contiguous
blocks in a file of the original file system. Thus, one LUT
record can identify a source for the contiguous group of
blocks. If the data block for the new LUT record is not
contiguous with the data block of the previous LUT
record or is not from the same source file, then the LUT
record is appended to the previous LUT record (Block
206). If the safe checksum does not correspond to the
safe checksum for a data block having the corresponding
iterative checksum, the process determines whether
a data block of modification data has been defined
(Block 172). The process continues until it determines
whether all data units in the file have been processed
(Block 210). If more data units exist, the sliding window
is moved by its length to capture a new data block
(Block 212). If the number of remaining data units do not
fill the sliding window (Block 214), the remaining data
units are stored in the delta modification data block file
(Block 218) and a corresponding LUT record is generated
(Block 220). The LUT records generated for the file
being processed are then appended to the LUT records
for other files previously stored in an LUT file for the new
version of the original file system (Block 222) and the
LUT records for the file are stored in the LUT file (Block
224). The offset for the first LUT record for the file being
processed and the number of LUT records for this file are
then stored in the meta-data of the delta directory entry
meta-data table for the file being processed (Block 228).
The process then checks for more entries in the delta
directory entry meta-data table to process (Block 230).
If there are more entries the process continues (Block
150). If all of the delta directory entries have been processed,
the delta index data block table is appended to
the basis index data block table (Block 234) and the
delta directory entry meta-data table for the entries in
the new version of the original file system is then
searched for any entries having a modification status of
"unmodified". These entries and their meta-data are
removed unless they have a descendant having a modification
status other than "unmodified" (Block 238).
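
The merging of contiguous LUT records in Blocks 200 to 206 can be sketched as follows. The list-based record layout and the name `append_lut` are assumptions for illustration; contiguity is checked in both the source file and the file being built.

```python
def append_lut(records, new):
    """Append a LUT record, merging it into the previous one when possible.

    Each record is [source_id, source_offset, length, target_offset].
    """
    if records:
        prev = records[-1]
        same_source = prev[0] == new[0]
        contiguous = (prev[1] + prev[2] == new[1]) and (prev[3] + prev[2] == new[3])
        if same_source and contiguous:
            prev[2] += new[2]   # extend the previous record and discard the new one
            return records
    records.append(list(new))
    return records
```

With 256-unit blocks, two adjacent matches from the same source collapse into a single 512-unit record, which keeps the LUT file small when long runs of the original file are unchanged.
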
[0023] In an embodiment of the present invention
that utilizes previous updates provided for the original
file system, the above process is modified to evaluate
the delta index data block tables for previous versions of
the original file system. Specifically, the process
searches the basis index data block tables and the delta
index data block table files for update versions to
locate data blocks having corresponding iterative and
safe checksums for corresponding "new" or "contents
modified" files in the latest version. Additionally, the
source of data blocks may also include delta modification
data files for previous update versions of the original
file system as well as the files of the original file
system and the delta modification data block file for the
latest version.

[0024] The delta directory entry meta-data table for
the new version of the original file system generated by
the process in Fig. 3 is then used by the process shown
in Fig. 4 to generate a delta directory map file. An entry
is selected from the delta directory entry meta-data
table (Block 250) and an entry in the delta directory map
file system is generated. The entry at least includes the
name of the entry (Block 254) and its modification
status (Block 256). If the modification status is "new",
"modified" or "contents modified" (Block 260), the new
meta-data is also stored in the delta directory map file
for the entry (Block 264). If the modification status is
"new" or "contents modified" (Block 266), the offset to
the first LUT record for the file in the LUT file and the
number of LUT records for the file in the LUT file are
stored in the delta directory map file (Block 268). The
process continues until all entries in the delta directory
entry meta-data table have been processed (Block
270). The name of the new file system hierarchy, its
version identifier, directory map file, LUT file, and
modification data files may now be compressed for delivery to a
system having a copy of the original file system.
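
The per-entry decisions of Fig. 4 (Blocks 254 to 268) can be sketched as below. The dictionary keys (`name`, `status`, `meta`, `lut_offset`, `lut_count`) and both function names are hypothetical; the text does not specify a record layout.

```python
from typing import Any, Dict, List


def build_map_entry(entry: Dict[str, Any]) -> Dict[str, Any]:
    """Build one delta directory map record from one meta-data table entry."""
    status = entry["status"]
    record: Dict[str, Any] = {"name": entry["name"], "status": status}
    # New meta-data is stored for "new", "modified" and "contents modified" entries.
    if status in ("new", "modified", "contents modified"):
        record["meta"] = entry["meta"]
    # Only entries whose contents changed need a pointer into the LUT file.
    if status in ("new", "contents modified"):
        record["lut_offset"] = entry["lut_offset"]
        record["lut_count"] = entry["lut_count"]
    return record


def build_map(table: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Process every entry in the delta directory entry meta-data table."""
    return [build_map_entry(e) for e in table]
```

A "deleted" entry therefore carries only its name and status, while a "modified" entry (meta-data changed, contents unchanged) carries meta-data but no LUT reference.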
[0025] Once the compressed representation of the
new version of the original file system is transferred to a
computer on which a copy of the original file system
hierarchy is stored, it may be used to update the original
file system. An application program may be provided as
part of that representation to perform the process
depicted in Fig. 5. Alternatively, the application program
may be part of the interface program provided for
accessing the content of the original file system
hierarchy, such as an extension to the file system program of
the recipient computer. The program decompresses the
representation of the new file system hierarchy and
stores the delta directory map file, LUT file, and delta
modification data block file in storage accessible to the
computer. The process then determines whether a
directory containing a delta modification data block file
for a previous version of the original file system
hierarchy is associated with a directory or drive containing the
original file system hierarchy (Block 300). If there is an
association with a directory containing a delta
modification data block file, that association is merged with an
association between the directory where the
decompressed files for the new file system hierarchy are
stored and the drive or directory where the original file
system hierarchy is stored (Block 302).
[0026] The merge replaces the existing associated
delta directory map file and LUT file with the new delta
directory map file and LUT file, but leaves any existing
delta modification data block files referenced in the new
LUT file. In other words, when a representation of a new
version of a file system hierarchy is transferred to a
computer to update a copy of the original file system
hierarchy, the process determines if there is an existing
association with a directory containing a delta
modification data block file. If there is such an association, then
that association is merged with that of the new version
and the merge replaces the existing delta directory map
file and LUT file with those of the new version (Block
302).

[0027] Alternatively, the replaced delta directory
map file and LUT file from the previous association
could instead be retained in addition to the new files.
With this alternative, the process could allow the user to
select which of a number of available versions of a file
system hierarchy is accessed when the user attempts to
access the original file system hierarchy. Such a
selection mechanism provides an accessible archive of
multiple versions of the file system hierarchy.
[0028] Otherwise, an association between the drive
or directory where the original file system hierarchy is
stored and the directory where the downloaded
decompressed files for the new version of the original file
system hierarchy are now located is generated (Block 308).
The application program may be coupled to the
operating system of the computer in which a copy of the
original file system hierarchy and the decompressed files for
the new version of the file system hierarchy are stored.
In a known manner, the operating system is modified to
detect any attempted access to the drive or directory
containing the original file system hierarchy or the files
for the new version of the file system hierarchy. In
response to an attempted operation to change the
physical media for the original file system hierarchy (Block
310), the application program stores a media change
indicator (Block 314) and verifies the identity of the
physical media when a subsequent attempt is made to
access the original file system hierarchy (Block 318). If
the physical media has changed, the application
program checks the media change indicator
and determines whether the original file system media
is available. If it is not, the program indicates that the
original file system hierarchy is not available for access
by the user. Otherwise, the access is processed.
Attempts to write data to the drive or directory
containing the original file system hierarchy or the files for the
new version of the original file system detected by the
application program (Block 320) are not processed
(Block 324).

[0029] For commands attempting to interrogate the
structure of the original file system hierarchy, the
application program responds by building data in two passes
and presenting that data to the user. A command to
interrogate the structure of the original file system
hierarchy is one such as a directory enumeration command.
In response to a structure inquiry (Block 328), the
application program first retrieves the requested structure
data from the original file system and deletes the entries
for which the modification status in the delta directory
map file is "deleted", "modified", "new" or "contents
modified". The data for these entries is obtained from
the delta directory map file and used to modify the
structure data responsive to the structure query (Block 330).
That is, the application program obtains the data to be
displayed for the original file system hierarchy, deletes
those files corresponding to delta directory map file
entries having a modification status of "deleted", adds
structure data for those entries in the directory map file
having a status of "new", and modifies the structure
data for those entries in the directory map file having a
status of "modified" or "contents modified". This data is
then provided to the operating system for display to the
user.
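
The two-pass construction of a directory listing can be sketched as follows. The in-memory shapes (`Listing`, `DeltaMap`) and the function name are hypothetical; an actual implementation would read the original media and the delta directory map file.

```python
from typing import Dict, Optional, Tuple

# Hypothetical shapes: a listing maps entry name -> meta-data;
# the delta map maps entry name -> (modification status, new meta-data or None).
Listing = Dict[str, dict]
DeltaMap = Dict[str, Tuple[str, Optional[dict]]]

CHANGED = {"deleted", "modified", "new", "contents modified"}


def enumerate_directory(original: Listing, delta: DeltaMap) -> Listing:
    """Combine the original listing with the delta directory map entries."""
    # Pass 1: keep only original entries that the delta map does not override.
    result = {name: meta for name, meta in original.items()
              if name not in delta or delta[name][0] not in CHANGED}
    # Pass 2: add back new and modified entries using the delta meta-data.
    for name, (status, meta) in delta.items():
        if status in ("new", "modified", "contents modified"):
            result[name] = meta
    return result
```

Entries marked "deleted" are dropped in the first pass and never restored, so the user sees the structure of the new version although the original media is unchanged.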

[0030] For file system operations that open a file in
the new version of the original file system hierarchy
(Block 340), the application program determines
whether the modification status of the file is
"unmodified". If it is, the operation is processed using the
contents of the original file system only. Otherwise, the
application program constructs and returns an open file
handle that identifies the file (Block 344). The open file
handle identifies the file for subsequent file operation
commands but does not open any underlying file. For
any file system operation command that interrogates
the properties of a file for which an open file handle
exists, the application program returns data from the
delta directory map file entries that correspond to the
file identified by the open file handle.
[0031] In response to an I/O operation command
that reads data from a file identified by an open file
handle (Block 350), the application program constructs a
response to the query by identifying the LUT record in
the LUT file that corresponds to the start of the
requested data (Block 352). If the underlying file
referenced in the LUT record is not opened, the application
program opens the underlying file and associates it with
the open file handle. The program then reads from the
LUT record whether the data for the requested data
block is to be read from the original file system hierarchy
or one of the delta modification data block files. After the
source file is identified, the offset data and data block
length are used to locate the first byte to be transferred
from the identified source file and the number of bytes to
be transferred, respectively. The corresponding number
of bytes are transferred from the source file to a
response being built (Block 356). If additional data is
required for the response (Block 360), the next LUT
record is used to extract data for the response (Block
364). This process continues until the data transferred
for an LUT record provides all of the data requested or
until the last entry for the file is reached. The response
built from the transfer of data from the source files
identified by the LUT records is then provided to the
operating system for delivery to the requesting program (Block
368). In this manner, a response is provided to a file
system operation that appears to be the result of a
single contiguous read operation. In response to a file
system operation that closes a data file (Block 370), the
application program closes all corresponding files in the
original file system hierarchy and the data files for the
new file system hierarchy (Block 372).
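
The read path of Blocks 350 to 368 can be sketched as a walk over the LUT records of a file. The `LutRecord` fields and `read_file` are hypothetical names, and for brevity the source files are assumed to be byte strings held in memory rather than files opened on demand.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LutRecord:
    """Hypothetical LUT record: where one run of bytes of the new file lives."""
    source: str   # name of the original file or of a delta modification data block file
    offset: int   # offset of the first byte within that source
    length: int   # number of bytes this record contributes


def read_file(records: List[LutRecord], sources: Dict[str, bytes],
              start: int, size: int) -> bytes:
    """Serve a read of `size` bytes at `start` by walking the file's LUT records."""
    response = bytearray()
    position = 0  # logical offset in the new file at which the current record begins
    for rec in records:
        rec_end = position + rec.length
        if rec_end > start and len(response) < size:
            skip = max(0, start - position)  # bytes of this record before the request
            take = min(rec.length - skip, size - len(response))
            begin = rec.offset + skip
            response += sources[rec.source][begin:begin + take]
        position = rec_end
        if len(response) >= size:
            break
    return bytes(response)
```

The caller receives one contiguous byte string even when the bytes come alternately from the original file and from a delta modification data block file.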
[0032] In the above description, a delta index data
block table is constructed to contain a delta index data
block record for each new block of modification data
(Block 176). When all the delta directory entries have
been processed, the delta index data block table is
appended to the basis index data block table (Block
234).

[0033] Alternatively (at Block 176), the basis index
data block table could be updated to contain a basis
index data block record for each new block of
modification data as that block of modification data is processed.
In this way, there would be no delta index data block
table and the step at Block 234 would be eliminated.
[0034] With this alternative, the new version of the
file system hierarchy can contain new or modified files in
which the same new block of modification data appears
more than once, but the generated representation of the
new version of the original file system hierarchy will only
contain a single copy of the new block of modification
data. A significant special case of this is where the
original file system hierarchy is empty and the method of
this invention then generates an efficiently compressed
representation of a file system hierarchy.
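
The special case of an empty original hierarchy can be sketched as block-level deduplication. The fixed four-byte block length, the use of SHA-256 as the only block identifier, and the names `store_blocks` and `BLOCK_SIZE` are assumptions made for this illustration.

```python
import hashlib
from typing import Dict, List, Tuple

BLOCK_SIZE = 4  # tiny fixed block length chosen only to keep the example readable


def store_blocks(files: Dict[str, bytes]) -> Tuple[bytes, Dict[str, List[Tuple[int, int]]]]:
    """Store each distinct block once; return the block data and per-file references."""
    index: Dict[bytes, int] = {}      # safe checksum -> offset in the block data
    data = bytearray()                # stands in for the modification data block file
    lut: Dict[str, List[Tuple[int, int]]] = {}
    for name, content in files.items():
        refs = []
        for i in range(0, len(content), BLOCK_SIZE):
            block = content[i:i + BLOCK_SIZE]
            key = hashlib.sha256(block).digest()  # SHA-256 is an assumption
            if key not in index:
                index[key] = len(data)  # index updated as each block is processed
                data += block
            refs.append((index[key], len(block)))
        lut[name] = refs
    return bytes(data), lut
```

Because the index is updated as each block is processed, a block that recurs within one file or across files is stored once and referenced by (offset, length) thereafter.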

[0035] The method of the invention is described for
versions of a file system hierarchy and the files holding
data contents. The method may also be applied to any
structure of identified objects which contain data. One
further example of such a hierarchy is a Directory
Services hierarchy representing objects used to manage a
computer network. Other similar examples would be
obvious to those skilled in the art.
[0036] The method of the invention generates a
compact representation of the differences between an
original version of a file system hierarchy and an
updated version of the file system hierarchy, and allows
the regeneration of the updated version of the file
system hierarchy from the original version of a file system
hierarchy using that generated representation. There
are many other uses of such a method. One such use is
to back up a file system hierarchy or updates to the file
system hierarchy to allow that version to be restored at
a later date. In this case, the sequence of generated
representations of the differences between the versions
of the file system hierarchy could be used to restore any
version. Other similar examples would be obvious to
those skilled in the art.

Claims 

1. A method for representing modifications to a set of
data objects comprising:

- generating a baseline of a first version of a set
of data objects;

- identifying the differences between a second
version of the set of data objects and the first
version of the set of data objects;

- generating update information corresponding
to the identified differences, said update
information including references to segmented
portions of said baseline that correspond to said
identified differences regardless of location of
said segmented portions in said baseline; and

- storing the update information whereby the
update information may be retrieved and used
to generate the second version of the set of
data objects from the first version of the set of
data objects.

2. The method of claim 1, said baseline being
generated by forming a basis index data block table file
system map from the first version of the set of data
objects.

3. The method of claim 2, said baseline generation
further includes forming a basis directory entry
meta-data table from the first version of the set of
data objects.

4. The method of claim 1 wherein said first version of
said set of data objects is a directory hierarchy of a
file system and the data objects are files.

5. The method of claim 1 wherein said first version of
said set of data objects is a directory services
hierarchy of data objects used to manage a computer
network.

6. The method of claim 1, said generation of said
baseline further comprising:

- selecting a data object of said first version of
said set of data objects;

- generating and storing identifying data
corresponding to the identity of said selected data
object;

- segmenting each selected data object into
blocks to form segmented portions;

- determining whether the data content of each
block is unique with respect to other data
blocks segmented from data objects previously
selected;

- storing data identifying the content of the data
block and its location within a corresponding
data object to form said segmented portions for
said baseline;

- continuing the selection, segmentation,
determination and storing of data for data blocks
within said data objects of said first version of
the set of data objects to generate the baseline
until all data objects within said first version of
said set of data objects have been selected.

7. The method of claim 6 wherein a size of said blocks
into which said selected data objects are
segmented is a length determined with reference to the
available computer resources.

8. The method of claim 6 or 7 wherein said data
identifying the content of data blocks and their locations
are stored within a basis index data block table file
system map.

9. The method of claim 8 wherein said identifying data
corresponding to each data object are stored in a
meta-data table of said baseline.

10. The method of claim 1 wherein said differences
between said first and said second versions are
identified by:

- identifying new data objects in said second
version of said set of data objects by determining
whether data objects in said second version of
said set of data objects are in said first version;

- identifying modified data objects in said second
version of said set of data objects by
determining whether said data objects in both said first
and second versions are the same;

- identifying deleted data objects by determining
the absence of said first version data objects in
said second version of said set of data objects;
and

- storing data identifying said new data objects,
said modified data objects and said deleted
data objects in said update information.

11. The method of claim 6 wherein a size of said blocks
into which said selected data objects are
segmented is a fixed length.

12. The method of claim 10 wherein said generation of
said update information includes:

- generating a delta modification data block file;
and

- generating a delta look-up table.

13. The method of claim 12 wherein said identification
of new data objects, modified data objects and
deleted data objects includes:

- comparing meta-data for data objects in the
second version of the set of data objects to
meta-data for data objects in said baseline to
determine whether a data object in said second
version of said set of data objects is a new data
object or a modified data object; and

- comparing meta-data for data objects in said
baseline to meta-data for data objects in said
second version of said data objects to
determine whether one of said data objects in said
first version of said data objects is absent in
said second set of said data objects.

14. The method of claim 13 wherein said meta-data for
data objects in the first version of the set of data
objects are stored in a basis directory entry
meta-data table; and

- said meta-data for data objects in the second
version of the set of data objects are stored in a
delta directory entry meta-data table.

15. The method of claim 14 further comprising:

- generating a delta directory map file from said
delta directory entry meta-data table and said
delta look-up table.

16. The method of claim 1 further comprising:

- segmenting data objects in said second
version;

- identifying whether said segmented portions of
said data objects correspond to segmented
portions of said data objects in said baseline;
and

- storing each said data block and a
corresponding identifier for each said segmented portion
in said second version for which no
corresponding segmented portion was identified in
said baseline.

17. The method of claim 16 further comprising:

- identifying a contiguous segmented portion in
the second version of said set of data objects
that corresponds to a contiguous segmented
portion in said baseline.

18. The method of claim 16 wherein said storage of
said identifiers includes:

- generating an iterative checksum from a
segmented portion of said data object;

- generating a safe checksum from said
segmented portion of said data object; and

- forming the identifier from said iterative
checksum and said safe checksum.

19. The method of claim 16 wherein said identification
of said corresponding segmented portions in said
first and said second versions of said sets of data
objects includes:

- comparing an iterative checksum for said
segmented portion in said second version of said
set of data objects to an iterative checksum for
said segmented portion in said baseline; and

- comparing a safe checksum for said
segmented portion in said second version of said
set of data objects to a safe checksum for said
segmented portion in said baseline in response
to said iterative checksum comparison
indicating said iterative checksums correspond.

20. The method of claim 1 further comprising:

- identifying differences between a new version
of the set of data objects and the baseline and
the update information for all intervening
versions of the set of data objects;

- generating update information corresponding
to the identified differences; and

- storing the update information whereby the
update information may be retrieved and used
to generate the new version of the set of data
objects from the baseline and the update
information of the intervening and new sets of data
objects.



21. The method of claim 20 wherein said generation of
update information may include references to
segmented portions of said baseline or to the update
information for any intervening version of the set of
data objects, said references being stored in said
update information for said new version so that
segmented portions of said new version of the set of
data objects that occurred in the baseline or any
intervening version of the set of data objects are not
stored in said update information for the new
version of the set of data objects.

22. The method of claim 20 or 21 wherein said
generation of update information may include references
to segmented portions or to the update information
for said new version of the set of data objects, said
references being stored in said update information
for said new version so that segmented portions of
said new version of the set of data objects in said
new version of the set of data objects are not stored
more than once in said update information for the
new version of the set of data objects.

23. A method for providing data associated with a data
object contained within an updated set of data
objects stored on a computer that includes:

- representing a hierarchical set of data objects
with a baseline corresponding to an original
version of the hierarchical set of data objects
and update information corresponding to
differences between the original version and a new
version of the hierarchical set of data objects;

- responding to a data access operation
requesting access to a data object within the
hierarchical set of data objects designating one of the
baseline and the update information as a
source for at least a portion of the requested
data object and identifying the data to be
retrieved from the source that corresponds to
the data object;

- retrieving the identified data from the
designated source; and

- continuing to designate one of the baseline and
the update information as the source for
additional portions of the requested data object to
identify the data to be retrieved from the
designated source and to retrieve the identified data
until all data for the requested data object has
been retrieved whereby data objects requested
by data access operations may be represented
by data stored in both the baseline and the
update information representing the
hierarchical set of data objects.

24. The method of claim 23 where multiple versions of
said update information may be stored and any one
such version may be selected at any time to
determine which updated version of said hierarchical set
of data objects is retrieved in response to said data
access operation.



25. The method of claim 23 or 24 in which the response
to a data access operation further includes:

- determining whether the data access operation
accesses a data object represented only by the
baseline; and

- generating an identifier for a data object
identified by the data access operation in response
to a determination that the data object is
represented by the baseline and the update
information.

26. The method of claim 23 which further includes:

- not processing data access operations that
attempt to write data to the baseline or the
update information.

27. The method of claim 23 which further includes:

- responding to a structure inquiry by retrieving
structure data from the baseline; and

- modifying the structure data retrieved from the
baseline with the update information for the
new version of the hierarchical set of data
objects.

28. The method of claim 27 which further includes:

- identifying a modification status in the update
information for data objects identified by the
structure inquiry;

- deleting the structure data for data objects
having a deletion modification status;

- adding the structure data in the update
information for data objects having an added
modification status; and

- modifying the structure data with data from the
update information for data objects having
modified status.

29. The method of claim 23 which further includes:

- storing the update information in a delta
directory map file, a look-up table, and at least one
delta modification data block file.

30. The method of claim 23 which further includes:

- storing structure data for the hierarchical
set of data objects in the baseline and the
update information; and

- storing data content of the data objects in the
baseline and the update information, the data
content of the data objects being separate from
the structure data.

31. The method of claim 30 wherein the storage of
structure data includes storing of directory
hierarchical data; and

- the storage of data content of the data objects
includes storing of data within the data objects
in the hierarchical set of data objects.